A hidden Markov model-based missing data imputation approach
Authors
Abstract
The accuracy of automatic speech recognizers degrades rapidly when speech is distorted by noise, and robustness against noise has become one of the challenging problems. In this paper, a hidden Markov model (HMM) based data imputation approach is presented to improve the noise robustness of speech recognition at the front end of the recognizer. To exploit the correlation between different filter-bank channels, the approach performs missing data imputation with an HMM of L states, each of which has a Gaussian output distribution with a full covariance matrix. "Missing" data in speech filter-bank vector sequences are recovered by a maximum a posteriori (MAP) procedure from either the locally optimal state path or the marginal Viterbi-decoded HMM state sequence. The approach was tested on a speaker-independent continuous Mandarin speech recognizer with a syllable loop of perplexity 402, for both Gaussian and babble noise, each at 6 different SNR levels ranging from 0 dB to 25 dB, and showed a significant improvement in robustness against additive noise.
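The per-state recovery step can be illustrated with a minimal sketch. It assumes the HMM state for a frame is already known (for example from a decoded state path) and that the MAP estimate of the missing filter-bank channels is taken as the conditional mean of that state's full-covariance Gaussian given the reliable channels. The function name `impute_given_state`, the mask convention, and the toy numbers are illustrative assumptions, not details from the paper.

```python
import numpy as np

def impute_given_state(x, missing, mean, cov):
    """Fill the missing entries of one frame with the conditional mean
    of a Gaussian N(mean, cov), given the observed entries.

    Assumption: the state's Gaussian is known and `missing` is a boolean
    mask marking unreliable channels (hypothetical convention).
    """
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    missing = np.asarray(missing, dtype=bool)
    observed = ~missing

    filled = x.copy()
    if not missing.any():
        return filled              # nothing to impute
    if not observed.any():
        filled[:] = mean           # no evidence: fall back to the state mean
        return filled

    cov_oo = cov[np.ix_(observed, observed)]   # observed/observed block
    cov_mo = cov[np.ix_(missing, observed)]    # missing/observed block
    residual = x[observed] - mean[observed]

    # Conditional mean: mu_m + S_mo S_oo^{-1} (x_o - mu_o)
    filled[missing] = mean[missing] + cov_mo @ np.linalg.solve(cov_oo, residual)
    return filled

# Toy example with three channels; channel 1 is marked missing.
mean = np.array([1.0, 2.0, 3.0])
cov = np.array([[2.0, 1.0, 0.0],
                [1.0, 2.0, 0.0],
                [0.0, 0.0, 1.0]])
frame = np.array([3.0, np.nan, 3.0])
mask = np.isnan(frame)
restored = impute_given_state(frame, mask, mean, cov)
```

Because the covariance is full, the off-diagonal term between channels 0 and 1 lets the observed channel pull the imputed value away from the state mean; with a diagonal covariance the imputed value would simply equal the mean.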
Similar resources
Accuracy evaluation of different statistical and geostatistical censored data imputation approaches (Case study: Sari Gunay gold deposit)
Most geochemical datasets contain missing data in varying proportions, which may cause significant problems in geostatistical modeling or multivariate analysis of the data. Therefore, it is common to impute the missing data in most geochemical studies. In this study, three approaches called half detection (HD), multiple imputation (MI), and the cosimulation based on Markov model 2...
Audio Imputation Using the Non-negative Hidden Markov Model
Missing data in corrupted audio recordings poses a challenging problem for audio signal processing. In this paper we present an approach that allows us to estimate missing values in the time-frequency domain of audio signals. The proposed approach, based on the Nonnegative Hidden Markov Model, enables more temporally coherent estimation for the missing data by taking into account both the spect...
State based imputation of missing data for robust speech recognition and speech enhancement
Within the context of continuous-density HMM speech recognition in noise, we report on imputation of missing time-frequency regions using emission state probability distributions. Spectral subtraction and local signal-to-noise estimation based criteria are used to separate the present from the missing components. We consider two approaches to the problem of classification with missing data: ma...
Several approaches to handling missing values of quantitative variables and their effect on the results of a clinical trial
Background and Objectives: A major challenge that affects longitudinal studies is the problem of missing data. Missing data may result in the loss of part of the information, which reduces the accuracy of the estimator and makes the results biased and inaccurate. Therefore, it is necessary to evaluate the missing data mechanism in longitudinal research and to consider thi...
An Empirical Comparison of Performance of the Unified Approach to Linearization of Variance Estimation after Imputation with Some Other Methods
Imputation is one of the most common methods to reduce the effects of item nonresponse. Imputation results in a complete data set, which makes it possible to use naïve estimators. After most common imputation methods are applied, the mean and total (imputation estimators) are still unbiased. However, their variances (imputation variances) are underestimated by naïve variance estimators. Sampling mechanism an...